Clustering Using a Similarity Measure Based on Shared Near Neighbors

Authors

  • Ray A. Jarvis
  • Edward A. Patrick
Abstract

A nonparametric clustering technique incorporating the concept of similarity based on the sharing of near neighbors is presented. In addition to being an essentially parallel approach, the computational elegance of the method is such that the scheme is applicable to a wide class of practical problems involving large sample size and high dimensionality. No attempt is made to show how a priori problem knowledge can be introduced into the procedure.
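
The abstract does not spell out the procedure, so the following is only a minimal sketch of how similarity based on shared near neighbors might be turned into clusters. It assumes a linking rule that is commonly described for this kind of method: two points are joined when each appears in the other's list of k nearest neighbors and the two lists have at least k_t entries in common. The function names, the parameters `k` and `k_t`, and the use of Euclidean distance are illustrative assumptions, not details taken from the paper.

```python
from math import dist


def k_nearest(points, k):
    """For each point index, return the indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        # Brute-force ranking by Euclidean distance (assumption: any metric could be used).
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: dist(p, points[j]))
        neighbors.append(order[:k])
    return neighbors


def shared_near_neighbor_clusters(points, k, k_t):
    """Return one integer label per point.

    Assumed linking rule: i and j are joined when they appear in each
    other's k-nearest lists and share at least k_t neighbors.
    Clusters are the connected components of the resulting links.
    """
    nn = k_nearest(points, k)
    nn_sets = [set(row) for row in nn]
    parent = list(range(len(points)))  # union-find forest

    def find(x):
        # Find the component root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(len(points)):
        for j in nn[i]:
            # j > i visits each mutual pair once.
            if j > i and i in nn_sets[j]:
                if len(nn_sets[i] & nn_sets[j]) >= k_t:
                    parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11)]
    print(shared_near_neighbor_clusters(pts, k=3, k_t=1))
```

Each point's neighbor list is computed independently of the others, which is one way to read the abstract's remark that the approach is essentially parallel. The brute-force neighbor search here is quadratic in the number of points and is only meant to keep the sketch self-contained.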

Similar resources

An Adaptive Spectral Clustering Algorithm Based on the Importance of Shared Nearest Neighbors

The construction of a similarity matrix is a significant step in the spectral clustering algorithm, and the Gaussian kernel function is one of the most common measures for constructing the similarity matrix. However, with a fixed scaling parameter, the similarity between two data points is not adaptive and is not appropriate for multi-scale datasets. In this paper, through quantitating the value ...

Spatio-Temporal Outlier Detection Technique

Outlier detection is a very important function of data mining and has numerous applications. This paper proposes a clustering-based approach for outlier detection using spatio-temporal data. It uses a three-step approach to detect spatio-temporal outliers. In the first step of outlier detection, clustering is performed on the spatio-temporal dataset with the proposed Spatio-Temporal Shared Nearest ...

Finding Sequence Clusters: A Shared Near Neighbors Approach

Sequence clustering is one of the most fundamental topics and can be applied in various research fields. Most previous work on sequence clustering is dedicated to single-label clustering, in which the whole similarity of equal-length sequences is considered and measured by the Euclidean distance function. However, intrinsic properties behind sequences demand multi-label clustering. In addition...

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and measuring their similarity is a core part of MTS analysis. Many research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...

Improving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering

Classification is one of the important parts of data mining and knowledge discovery. In most cases, the data used to train the clusters is not well distributed. This inappropriate distribution occurs when one class has a large number of samples while the number of samples in the other classes is inherently low. In general, the methods of solving this kind of prob...

Journal title:
  • IEEE Trans. Computers

Volume 22, Issue 

Pages  -

Publication date: 1973